KMID : 0644020210340010037
Journal Of Korean Medical Classics
2021 Volume.34 No. 1 p.37 ~ p.45
Detecting Local Text Reuse in the Texts of East Asian Traditional Medicine
Oh Jun-Ho

Abstract
Objectives: The purpose of this paper is to examine quantitative methods for estimating and detecting local text reuse in the texts of East Asian traditional medicine.

Methods: We introduce techniques that estimate the volume of local text reuse with n-grams and techniques that detect the reuse directly with the Smith-Waterman algorithm (SW algorithm). Using these techniques, we estimated and detected local text reuse between 『Donguibogam』 and 『Huangdineijing·Suwen』.
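
The n-gram estimate described above can be sketched as follows. This is a minimal illustration, not the procedure used in the paper: the use of character n-grams, the function names `char_ngrams` and `ngram_overlap`, and the default `n=3` are all assumptions made for the example.

```python
def char_ngrams(text, n):
    """Return the set of character n-grams contained in text."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def ngram_overlap(source, target, n=3):
    """Estimate reuse as the share of target's n-grams also found in source.

    Hypothetical helper: the paper does not specify this exact measure.
    """
    target_grams = char_ngrams(target, n)
    if not target_grams:
        # target is shorter than n, so there is nothing to compare
        return 0.0
    shared = target_grams & char_ngrams(source, n)
    return len(shared) / len(target_grams)


if __name__ == "__main__":
    print(ngram_overlap("abcdef", "xxcdefxx", n=3))
```

Because the measure only counts shared n-grams, it gives a quick estimate of how much text overlaps but does not say where the reused passage begins or ends.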

Results: Estimates based on n-grams contained more errors than results from the SW algorithm. Because the SW algorithm directly detects the strings suspected of local text reuse, it produced more accurate results.
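
Direct detection with the SW algorithm can be sketched as below. This is a simplified illustration with a linear gap penalty; the scoring values (`match=2`, `mismatch=-1`, `gap=-1`) and the function name `smith_waterman` are assumptions chosen for the example, not parameters reported in the paper.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Return (score, span_in_a, span_in_b) for the best local alignment.

    Scoring values are illustrative assumptions.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the scoring matrix; cells never drop below zero, which is
    # what makes the alignment local rather than global.
    for i in range(1, rows):
        for j in range(1, cols):
            step = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + step,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until the score reaches zero.
    i, j = best_pos
    end_i, end_j = i, j
    while i > 0 and j > 0 and score[i][j] > 0:
        step = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + step:
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    return best, a[i:end_i], b[j:end_j]


if __name__ == "__main__":
    print(smith_waterman("xxABCDEFyy", "zzABCXEFww"))
```

In this sketch the two returned spans differ by one character, which shows how the algorithm can flag a passage as suspected reuse even when the strings do not match exactly. The matrix has one cell per pair of characters, so time and memory grow with the product of the two text lengths.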

Conclusions: Although the n-gram method does not find local text reuse accurately, its high speed makes it preferable for certain purposes, such as screening for similar documents. The SW algorithm, on the other hand, is relatively good at finding similar phrases suspected of being local text reuse even when the strings do not match completely. However, because it consumes a large amount of time and computing resources, its benefits are limited to cases where precise results are required.
KEYWORD
Text Reuse, n-gram, Smith-Waterman Algorithm, Korean Medical Classics, East Asian traditional medicine
Listed journal information
Korea Research Foundation (KCI)